Making a network from scratch

Outline

  • Understanding the purpose of error in a simple learning machine
  • Understanding network architecture
  • Building a three-layer network:
    • Linear Algebra
    • How weights are used in the network
    • How the network learns from error
  • Implementing the network

Simple learning machine

Let's assume a linear relationship between the number of Uber employees in an elevator and the time it takes you to reach the 18th floor.

$$30=5x$$

How do we find x?

One way of solving this problem is to solve for x directly, giving $x=6$ (i.e., for every Uber employee in the elevator, you should expect to wait an extra 6 seconds).

Simple Learning Machine

A different way of solving this problem is to plug in a random value and see how wrong you are.

Let's use x=0.3

error = 30 - 5 * 0.3

error = 28.5

Simple Learning Machine

Let's double x: 0.3 * 2 = 0.6

error = 30 - 5 * 0.6

error = 27

This is still not great. We doubled the guessed value, but our error decreased only by about 5% (1 - 27/28.5). If we really want to make some progress, we need to increase the value by a lot more. We can use the change in the error to figure out by how much x should increase.
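
As a minimal sketch of this idea (the update fraction of 0.1 below is an arbitrary choice, not something derived from the problem), we can repeatedly nudge x in proportion to the remaining error:

In [ ]:
### guess-and-adjust: nudge x in proportion to the remaining error
target = 30       # observed wait time in seconds
employees = 5     # Uber employees in the elevator
x = 0.3           # initial random guess for seconds per employee
step = 0.1        # arbitrary fraction of the error to apply at each iteration

for i in range(20):
    error = target - employees * x   # how wrong the current guess is
    x = x + step * error             # nudge x upwards while the error is positive

print("final guess for x: {:.3f}".format(x))   # ends up very close to 6, the algebraic solution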

Things to remember:

  • We used error to modify the value of x.
  • We nudged the weight (x) based on how big our error was.
  • We don't use algebra to find the relationship between input and output. Often we will not know what that relationship is (i.e., linear, non-linear), or there might be hundreds of weights to find (making the problem algebraically intractable).

A simple neuron (source)

Mathematical Formulation of the Neuron

Activation Functions

Sigmoid Activation Function

$$y=\frac{1}{1+e^{-x}}$$
  • Can saturate: when the output of the neuron gets very close to 1 or 0, its gradient is nearly zero and the neuron effectively stops learning (see the sketch below).
  • Does not zero-center the output. This is an issue for deep architectures, since it makes gradient-based learning less stable.
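
A quick numerical illustration of the saturation point (my own sketch, not part of the original slides): for large positive or negative inputs the sigmoid's slope is nearly zero, so the weight updates that depend on it become tiny.

In [ ]:
### saturation of the sigmoid: the slope vanishes for large positive or negative inputs
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

for x in [-10, -2, 0, 2, 10]:
    s = sigmoid(x)
    slope = s * (1 - s)   # the sigmoid's derivative, which scales the weight updates
    print("x={:>3}: sigmoid={:.5f}, slope={:.5f}".format(x, s, slope))
# at x=10 the slope is about 0.00005, so a saturated neuron barely adjusts its weights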

There are a lot more activation functions out there. See this answer on CrossValidated.

A simple 3 layer network

How does a signal travel through a network?

Input to the first neuron in layer 2 is: $$x=(1*0.4) + (-0.6*0.6)$$ $$x=0.04$$

Output is: $$y(x)=\frac{1}{1+e^{-0.04}}$$ $$y(x)= 0.5099$$

Input to the second neuron in layer 2 is: $$x=(1*0.9) + (-0.6*0.3)$$ $$x=0.72$$

Output is: $$y(x)=\frac{1}{1+e^{-0.72}}$$ $$y(x)= 0.6726$$

Brief Intro to Matrix Multiplication

$$\textbf{X}=\textbf{WI}$$

$$ W= \left[ {\begin{array}{cc} w_{1,1} & w_{2,1} \\ w_{1,2} & w_{2,2} \\ \end{array} } \right] $$

$$ I= \left[ {\begin{array}{cc} i_{1} \\ i_{2} \\ \end{array} } \right] $$


In [15]:
### matrix multiplication using numpy
import numpy as np
W = np.array([[0.4, 0.6], [0.9, 0.3]])
I = np.array([1, -0.6])
X=np.dot(W, I)
X


Out[15]:
array([ 0.04,  0.72])
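
Passing these summed inputs through the sigmoid reproduces the two outputs we computed by hand above; scipy's expit (the same function the network class below uses) serves as the sigmoid here.

In [ ]:
### apply the sigmoid activation to the summed inputs X from the previous cell
import scipy.special
O = scipy.special.expit(X)
O   # approximately [0.5100, 0.6726], matching the worked example above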

Where are we?

How do we use error when multiple weights contribute to it?

Updating Weights (aka, training)

Updating Weights (aka, training)

The error is split among the incoming weights in proportion to their relative size

$$Error_{w11} = e * \frac{W_{11}}{W_{11}+W_{21}}$$
$$Error_{w21} = e * \frac{W_{21}}{W_{11}+W_{21}}$$

Backpropagating Errors

Using Matrix Multiplication to Update Weights

$$ error_{hidden}= \left[ {\begin{array}{cc} \frac{w_{1,1}}{w_{1,1}+w_{2,1}} & \frac{w_{1,2}}{w_{1,2}+w_{2,2}} \\ \frac{w_{2,1}}{w_{2,1}+w_{1,1}} & \frac{w_{2,2}}{w_{2,2}+w_{1,2}} \\ \end{array} } \right] \cdot \left[ {\begin{array}{cc} e_{1} \\ e_{2} \\ \end{array} } \right] $$

Although normalizing by the weights is mathematically the right thing to do, in practice you can actually ignore the normalization. This makes the matrix operations much simpler, with very little cost to the time it takes to train a network.

Using Matrix Multiplication to Update Weights

$$error_{hidden}= W^{T}_{hidden} * error_{output}$$
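
A small numpy sketch of this simplified rule, using hypothetical hidden-to-output weights and output errors (the values below are illustrative only):

In [ ]:
### backpropagate the output errors to the hidden layer with a matrix product
import numpy as np
W_ho = np.array([[2.0, 3.0],
                 [1.0, 4.0]])         # hypothetical hidden-to-output weights
e_output = np.array([0.8, 0.5])       # hypothetical errors at the two output nodes
e_hidden = np.dot(W_ho.T, e_output)   # error_hidden = W_hidden^T * error_output
e_hidden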

You've gotten pretty far, so you should take a breather.

What comes next is the hardest part of all, but we need to get through it.

How do we actually update weights?

Remember that at the start of the talk we used the change in error between two guesses to adjust the Uber-employee parameter? Well, that's not a good way of doing it.

  • Normally we don't know what function we want to model (in that example we assumed a linear function with one parameter)
  • The reality is that the error likely has a complex landscape

Gradient Descent High Level Explanation

We train the network over the data multiple times (i.e., for several epochs) to reduce the chance of it getting stuck in a local minimum.

Which error are we using for gradient descent?

  • $(target-actual)$
  • $|target-actual|$
  • $(target-actual)^2$

Which error are we using for gradient descent?

$$(target-actual)^2$$
  • The error function is smooth and continuous
  • The gradient gets smaller near the minimum, so we don't overshoot the objective
  • The math is easy enough with this error

You can make your own customized cost functions that include weights or that punish the model for making confident but inaccurate predictions (e.g., log loss). See more here. You can find more about their implementation by examining the keras source code.
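
To make the second reason above concrete, here is a small sketch (mine, not from the slides) comparing the gradients of the squared and absolute errors as the output closes in on the target:

In [ ]:
### squared error vs absolute error: how the gradient behaves near the minimum
import numpy as np

target = 1.0
for actual in [0.2, 0.6, 0.9, 0.99]:
    grad_squared = -2 * (target - actual)       # derivative of (target - actual)^2 w.r.t. actual
    grad_absolute = -np.sign(target - actual)   # derivative of |target - actual| w.r.t. actual
    print("actual={:.2f}: squared gradient={:+.2f}, absolute gradient={:+.2f}"
          .format(actual, grad_squared, grad_absolute))
# the squared-error gradient shrinks towards zero as we approach the target,
# while the absolute-error gradient stays at -1 right up to the minimum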

Gradient Descent

To do gradient descent, we need to figure out how the slope of the error relates to the change in weights. For this, we are going to use calculus.

$$\frac{\partial E}{\partial w_{jk}}$$

Gradient Descent

$$\frac{\partial E}{\partial w_{jk}}=\frac{\partial (t_{k}-o_{k})^2}{\partial w_{jk}}$$

Note that $t_{k}$ is a constant, while $o_{k}$ depends on the weights $w_{jk}$.

Gradient Descent

We can rewrite $$\frac{\partial E}{\partial w_{jk}} = \frac{\partial E}{\partial o_{k}} * \frac{\partial o_{k}}{\partial w_{jk}}$$

Taking the derivative of the error, we get: $$\frac{\partial E}{\partial w_{jk}} = -2(t_{k}-o_{k}) * \frac{\partial o_{k}}{\partial w_{jk}}$$

$o_{k}$ also needs to be differentiated. Remember that $o_{k}$ is equal to: $$S\left(\sum_{1}^{j} w_{jk}o_{j}\right)$$ where $S$ is the sigmoid.

Fortunately the derivative of the sigmoid is fairly easy to compute: $$\frac{dS(x)}{dx} = S(x)(1-S(x))$$
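
A quick finite-difference check of that identity (a sanity check I added, not part of the derivation):

In [ ]:
### numerically verify that dS/dx = S(x) * (1 - S(x))
import numpy as np

def S(x):
    return 1 / (1 + np.exp(-x))

x, h = 0.72, 1e-6
numeric = (S(x + h) - S(x - h)) / (2 * h)   # finite-difference approximation of dS/dx
analytic = S(x) * (1 - S(x))                # the identity used in the derivation
print(numeric, analytic)                    # the two values agree to many decimal places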

Gradient Descent

$$\frac{\partial E}{\partial w_{jk}} = -2(t_{k}-o_{k}) * S(\sum_{1}^{j} w_{jk}o_{j}) * (1-S(\sum_{1}^{j} w_{jk}o_{j})) * \frac{\partial \sum_{1}^{j} w_{jk}o_{j}}{\partial w_{jk}}$$

We still have to take the derivative of the inner expression: $\sum_{1}^{j} w_{jk}o_{j}$

That comes out to $o_{j}$. Does anyone know why? (The sum is linear in $w_{jk}$: every term except $w_{jk}o_{j}$ is a constant with respect to $w_{jk}$, so only $o_{j}$ survives the differentiation.)

Our final expression comes to $$\frac{\partial E}{\partial w_{jk}} = -(t_{k}-o_{k}) * S(\sum_{1}^{j} w_{jk}o_{j}) * (1-S(\sum_{1}^{j} w_{jk}o_{j}))* o_{j}$$

Notice that we dropped the constant 2. In the grand scheme of things it doesn't really change much because, as we will see later, we use a learning rate parameter to control how large each update step is.

Gradient descent

Let me rewrite the expression we just derived in pseudo-English:

$$\text{partial change in error for each partial change in weights} = -(error_{output}) * sigmoid(output) * (1-sigmoid(output)) * o_{input}$$

Gradient descent (weight updates)

When we update the weights, we don't adjust them by the full value of the gradient. Instead, we scale the gradient by a learning rate parameter, $\alpha$.

$$ w_{jk} = w_{jk} - \alpha \frac{\partial E}{\partial w_{jk}} $$
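
A minimal sketch of one such update for a single weight, with placeholder numbers; the same pattern appears, vectorised, in the train() method of the network class further down.

In [ ]:
### one gradient-descent step for a single hidden-to-output weight (placeholder numbers)
import numpy as np

def S(x):
    return 1 / (1 + np.exp(-x))

alpha = 0.2                       # learning rate
w_jk = 0.9                        # current weight from hidden node j to output node k
o_j = 0.8                         # output of hidden node j
t_k = 0.99                        # target for output node k
o_k = S(w_jk * o_j)               # actual output (only one hidden node in this toy case)

# dE/dw_jk = -(t_k - o_k) * S(sum) * (1 - S(sum)) * o_j, and here S(sum) is just o_k
gradient = -(t_k - o_k) * o_k * (1 - o_k) * o_j
w_jk = w_jk - alpha * gradient    # w_jk <- w_jk - alpha * dE/dw_jk
print(w_jk)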

Example

$$\frac{\partial E}{\partial w_{jk}} = -(t_{k}-o_{k}) * S(\sum_{1}^{j} w_{jk}o_{j}) * (1-S(\sum_{1}^{j} w_{jk}o_{j}))* o_{j}$$
  • Note that the errors above are not really realistic. When we apply the sigmoid transformation, our outputs will never be larger than 1, so if you get errors larger than 1 with sigmoids, your outcome variable is probably not coded as 0/1 as it should be.
  • $-(t_{1}-o_{1})$ is equal to 1 for the first node. For the second node it is equal to 2.
  • This expression $\sum_{1}^{j} w_{jk}o_{j}$ inside the sigmoid is equal to $3*0.8+2*0.6 = 3.6$ for the first output node and $2*0.8+2*0.6 = 2.8$ for the second output node. These are the inputs for the final layer.
  • Taking the sigmoid is simple: $\frac{1}{1+e^{-3.6}} = 0.9734$ for the first node. For the second node, the output will be $\frac{1}{1+e^{-2.8}} = 0.9426758$.
  • The last part, $o_{j}$, represents the output from the previous layer. In this case it is 0.8 and 0.6 (the sketch below evaluates the full expression with these numbers).
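
The same arithmetic in numpy, using the numbers quoted in the bullets above (hidden outputs 0.8 and 0.6, errors 1 and 2, summed inputs 3.6 and 2.8):

In [ ]:
### evaluate dE/dw_jk for the worked example above
import numpy as np

def S(x):
    return 1 / (1 + np.exp(-x))

o_hidden = np.array([0.8, 0.6])    # outputs of the hidden layer
neg_errors = np.array([1.0, 2.0])  # -(t_k - o_k) for the two output nodes
sums = np.array([3.6, 2.8])        # summed inputs into the two output nodes

# rows index the output node k, columns index the hidden node j
gradients = np.outer(neg_errors * S(sums) * (1 - S(sums)), o_hidden)
gradients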

In [16]:
import numpy as np
import scipy.special
# create the neural network
class network:
    def __init__(self, inputnodes, hiddennodes, outputnodes, learning_rate):
        # set the number of nodes for each input, hidden, and output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes
        
        # create weights linking the nodes from one layer to another
        self.wih = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.hnodes, self.inodes))
        self.who = np.random.normal(0.0, pow(self.onodes, -0.5), (self.onodes, self.hnodes))
        
        # set the learning rate
        self.lr = learning_rate
        
        # set the activation function
        self.activation_function = lambda x: scipy.special.expit(x)
        
        pass
        
    
    def train(self, inputs_list, target_list):
        inputs = np.array(inputs_list, ndmin=2).T
        targets = np.array(target_list, ndmin=2).T
        
        # calculate signals into the hidden layer
        hidden_inputs = np.dot(self.wih, inputs)
        # calculate the signals emerging from the hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        
        # calculate signals into the final output layer
        final_inputs = np.dot(self.who, hidden_outputs)
        final_outputs = self.activation_function(final_inputs)
        
        # get the output error (target - actual)
        output_errors = targets - final_outputs
        
        # hidden layer errors
        hidden_errors = np.dot(self.who.T, output_errors)
        
        # update the weights using the backpropagation formula for the middle layer to final layer
        self.who += self.lr * np.dot((output_errors * final_outputs * (1-final_outputs)),
                                        np.transpose(hidden_outputs))
        
        # update the weights for the wih layer
        self.wih += self.lr * np.dot((hidden_errors * hidden_outputs * (1-hidden_outputs)), np.transpose(inputs))
        
        pass
    
    def predict(self, input_list):
        inputs = np.array(input_list, ndmin=2).T
        
        # calculate signals into the hidden layer
        hidden_inputs = np.dot(self.wih, inputs)
        # calculate the signals emerging from the hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        
        # calculate signals into the final output layer
        final_inputs = np.dot(self.who, hidden_outputs)
        final_outputs = self.activation_function(final_inputs)
        
        return final_outputs

In [17]:
# read the train and test data
# you can get the data from here: https://www.kaggle.com/c/digit-recognizer/data
import pandas as pd
train = pd.read_csv('data/mnist_train.csv', header=None)
test = pd.read_csv('data/mnist_test.csv', header=None)

x_train = train.loc[:, 1:]
y_train = train.loc[:, 0]
x_test = test.loc[:, 1:]
y_test = test.loc[:, 0]

In [18]:
pd.set_option('display.max_columns', 30)
pd.DataFrame(np.array(x_train.iloc[1]).reshape(28, 28))


Out[18]:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 51 159 253 159 50 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 48 238 252 252 252 237 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 0 0 0 0 0 54 227 253 252 239 233 252 57 6 0 0 0 0 0 0
7 0 0 0 0 0 0 0 0 0 0 0 10 60 224 252 253 252 202 84 252 253 122 0 0 0 0 0 0
8 0 0 0 0 0 0 0 0 0 0 0 163 252 252 252 253 252 252 96 189 253 167 0 0 0 0 0 0
9 0 0 0 0 0 0 0 0 0 0 51 238 253 253 190 114 253 228 47 79 255 168 0 0 0 0 0 0
10 0 0 0 0 0 0 0 0 0 48 238 252 252 179 12 75 121 21 0 0 253 243 50 0 0 0 0 0
11 0 0 0 0 0 0 0 0 38 165 253 233 208 84 0 0 0 0 0 0 253 252 165 0 0 0 0 0
12 0 0 0 0 0 0 0 7 178 252 240 71 19 28 0 0 0 0 0 0 253 252 195 0 0 0 0 0
13 0 0 0 0 0 0 0 57 252 252 63 0 0 0 0 0 0 0 0 0 253 252 195 0 0 0 0 0
14 0 0 0 0 0 0 0 198 253 190 0 0 0 0 0 0 0 0 0 0 255 253 196 0 0 0 0 0
15 0 0 0 0 0 0 76 246 252 112 0 0 0 0 0 0 0 0 0 0 253 252 148 0 0 0 0 0
16 0 0 0 0 0 0 85 252 230 25 0 0 0 0 0 0 0 0 7 135 253 186 12 0 0 0 0 0
17 0 0 0 0 0 0 85 252 223 0 0 0 0 0 0 0 0 7 131 252 225 71 0 0 0 0 0 0
18 0 0 0 0 0 0 85 252 145 0 0 0 0 0 0 0 48 165 252 173 0 0 0 0 0 0 0 0
19 0 0 0 0 0 0 86 253 225 0 0 0 0 0 0 114 238 253 162 0 0 0 0 0 0 0 0 0
20 0 0 0 0 0 0 85 252 249 146 48 29 85 178 225 253 223 167 56 0 0 0 0 0 0 0 0 0
21 0 0 0 0 0 0 85 252 252 252 229 215 252 252 252 196 130 0 0 0 0 0 0 0 0 0 0 0
22 0 0 0 0 0 0 28 199 252 252 253 252 252 233 145 0 0 0 0 0 0 0 0 0 0 0 0 0
23 0 0 0 0 0 0 0 25 128 252 253 252 141 37 0 0 0 0 0 0 0 0 0 0 0 0 0 0
24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

In [19]:
# plot an example
import matplotlib.pyplot as plt
%matplotlib inline

index = 1
image_array = np.asfarray(x_train.iloc[index].values.reshape((28, 28)))
plt.imshow(image_array, cmap='Greys', interpolation = 'None')
print("actual outcome value: {}".format(y_train[index]))


actual outcome value: 0

In [20]:
# scale the inputs into the range (0.01, 1.0); avoiding exact zeros matters because the weight update is proportional to the input signal
x_train_scaled = x_train/255 * 0.99 + 0.01
x_test_scaled = x_test/255 * 0.99 + 0.01

In [21]:
### Setting up the network
input_nodes = 784
hidden_nodes = 100
output_nodes = 10

# learning rate
learning_rate = 0.2

# create the instance of the model
net = network(input_nodes, hidden_nodes, output_nodes, learning_rate)


# build the target vectors: 0.01 everywhere, with 0.99 at the index of the correct digit
train_targets = np.zeros((len(y_train), output_nodes)) + 0.01
train_targets[np.array(range(0, len(y_train))), y_train.values] = 0.99

We are implementing a type of gradient descent called stochastic gradient descent. This basically means that we are updating the weights after each training example. There are other types of gradient descent, and I highly recommend reading this paper by Ruder (2016) for an overview on different types of gradient descent.

Stochastic gradient descent has a few benefits.

  • It is usually faster; you converge to a good solution in a few epochs.
  • You can use the algorithm in online learning settings, where training examples arrive one at a time.

Obviously, it is not the best numerical optimization algorithm out there. I recommend reading about ADAM (Kingma & Ba, 2015), which implements momentum and "memory"-like strategies that allow for very rapid convergence with a smaller likelihood of getting stuck in sub-optimal solutions.
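
As a rough illustration of those ideas, here is a toy sketch I added of the ADAM update rule applied to the elevator example from the start of the talk (the paper's momentum parameters, with a larger step size so the toy problem converges quickly):

In [ ]:
### toy sketch of the ADAM update rule on the elevator example (30 = 5x)
import numpy as np

alpha, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8
w, m, v = 0.3, 0.0, 0.0                    # parameter, momentum term, "memory" term

for t in range(1, 501):
    grad = -2 * 5 * (30 - 5 * w)           # gradient of the squared error (30 - 5w)^2
    m = beta1 * m + (1 - beta1) * grad     # momentum: running average of gradients
    v = beta2 * v + (1 - beta2) * grad**2  # memory: running average of squared gradients
    m_hat = m / (1 - beta1**t)             # bias corrections for the early iterations
    v_hat = v / (1 - beta2**t)
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)

print(w)   # w should end up close to 6, the solution we found algebraically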


In [22]:
for features, outcome in zip(x_train_scaled.values, train_targets):
    net.train(inputs_list=features, target_list=outcome)

In [23]:
predictions = np.argmax(net.predict(x_test_scaled).T, axis=1)
accuracy = np.mean(predictions == y_test.values)
print("observed accuracy {}".format(accuracy*100))


observed accuracy 94.93

In [10]:
### train for more epochs
epochs = 4

for epoch in range(epochs):
    print("Starting epoch {}".format(epoch+1))
    for features, outcome in zip(x_train_scaled.values, train_targets):
        net.train(inputs_list=features, target_list=outcome)


Starting epoch 1
Starting epoch 2
Starting epoch 3
Starting epoch 4

In [11]:
predictions = np.argmax(net.predict(x_test_scaled).T, axis=1)
accuracy = np.mean(predictions == y_test.values)
print("observed accuracy {}".format(accuracy*100))


observed accuracy 95.87

In [25]:
# Doing the same thing with an open source library: [Keras](https://keras.io/)
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD


batch_size = 1
num_classes = 10
epochs = 2

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Dense(100, activation='sigmoid', input_shape=(784,)))
model.add(Dense(10, activation='sigmoid'))

model.summary()

model.compile(loss='mean_squared_error',
              optimizer=SGD(),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])


60000 train samples
10000 test samples
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_9 (Dense)              (None, 100)               78500     
_________________________________________________________________
dense_10 (Dense)             (None, 10)                1010      
=================================================================
Total params: 79,510
Trainable params: 79,510
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/2
60000/60000 [==============================] - 85s - loss: 0.0656 - acc: 0.6225 - val_loss: 0.0446 - val_acc: 0.8035
Epoch 2/2
60000/60000 [==============================] - 85s - loss: 0.0374 - acc: 0.8349 - val_loss: 0.0309 - val_acc: 0.8694
Test loss: 0.0309349358231
Test accuracy: 0.8694
